Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems

Authors

Gerald Schubert

Holger Fehske

Georg Hager

Gerhard Wellein

Abstract

We evaluate optimized parallel sparse matrix-vector operations for several representative application areas on widespread multicore-based cluster configurations. First, the single-socket baseline performance is analyzed and modeled with respect to basic architectural properties of standard multicore chips. Beyond the single node, the performance of parallel sparse matrix-vector operations is often limited by communication overhead. Starting from the observation that nonblocking MPI calls do not hide communication cost with standard MPI implementations, we demonstrate that explicit overlap of communication and computation can be achieved by using a dedicated communication thread, which may run on a virtual core. Moreover, we identify performance benefits of hybrid MPI/OpenMP programming due to improved load balancing, even without explicit communication overlap. We compare performance results for pure MPI, the widely used “vector-like” hybrid programming strategies, and explicit overlap on a modern multicore-based cluster and a Cray XE6 system.
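The explicit-overlap idea from the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python's `threading` module instead of MPI/OpenMP, and the names `spmv_csr`, `overlapped_spmv`, and `fetch_halo` are hypothetical. The assumption is that the locally owned rows are split into a local part (columns whose vector entries are already in memory) and a remote part (columns whose entries must first be received), both stored in CSR format. A dedicated thread performs the simulated halo exchange while the main thread computes the local part.

```python
import threading


def spmv_csr(row_ptr, col_idx, values, x, y):
    """Accumulate y += A @ x for a matrix A stored in CSR format."""
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] += acc


def overlapped_spmv(local, remote, x_local, fetch_halo):
    """Compute y = A_local @ x_local + A_remote @ x_halo.

    `local` and `remote` are (row_ptr, col_idx, values) tuples.
    `fetch_halo` is a hypothetical stand-in for the communication step
    (in a real code, the MPI receive of remote vector entries).
    """
    y = [0.0] * (len(local[0]) - 1)
    halo = {}

    def communicate():
        # Runs on the dedicated communication thread.
        halo["x"] = fetch_halo()

    comm_thread = threading.Thread(target=communicate)
    comm_thread.start()

    # Local part proceeds while the communication thread is busy.
    spmv_csr(*local, x_local, y)

    # The remote part needs the halo data, so wait for it first.
    comm_thread.join()
    spmv_csr(*remote, halo["x"], y)
    return y


if __name__ == "__main__":
    # Two owned rows of A = [[4, 1, 0, 2], [0, 3, 5, 0]], x = [1, 2, 3, 4].
    # Columns 0-1 are local; columns 2-3 are remote (halo indices 0-1).
    local = ([0, 2, 3], [0, 1, 1], [4.0, 1.0, 3.0])
    remote = ([0, 1, 2], [1, 0], [2.0, 5.0])
    print(overlapped_spmv(local, remote, [1.0, 2.0], lambda: [3.0, 4.0]))
```

In this sketch the overlap is only structural: the local product never touches the halo data, so it can run concurrently with the exchange, and the join marks the single synchronization point before the remote product.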

Related resources

Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation

The increasing number of processing elements and decreasing memory to core ratio in modern high-performance platforms makes efficient strong scaling a key requirement for numerical algorithms. In order to achieve efficient scalability on massively parallel systems scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed in modern architecture...

Efficient Multicore Sparse Matrix-Vector Multiplication for Finite Element Electromagnetics on the Cell-BE processor

Multicore systems are rapidly becoming a dominant industry trend for accelerating electromagnetics computations, driving researchers to address parallel programming paradigms early in application development. We present a new sparse representation and a two level partitioning scheme for efficient sparse matrix-vector multiplication on multicore systems, and show results for a set of finite elem...

Prospects for Truly Asynchronous Communication with Pure MPI and Hybrid MPI/OpenMP on Current Supercomputing Platforms

We investigate the ability of MPI implementations to perform truly asynchronous communication with nonblocking point-to-point calls on current highly parallel systems, including the Cray XT and XE series. For cases where no automatic overlap of communication with computation is available, we demonstrate several different ways of establishing explicitly asynchronous communication by variants of ...

Hybrid-Parallel Sparse Matrix–Vector Multiplication and Iterative Linear Solvers with the communication library GPI

We present a library of Krylov subspace iterative solvers built over the PGAS-type communication layer GPI. The hybrid pattern is here the appropriate choice to reveal the hierarchical parallelism of clusters with multi- and manycore nodes. Our approach includes asynchronous communication and differs in many aspects from the classical one. We first present the GPI-based implementation of the spar...

Performance limitations for sparse matrix-vector multiplications on current multicore environments

The increasing importance of multicore processors calls for a reevaluation of established numerical algorithms in view of their ability to profit from this new hardware concept. In order to optimize the existent algorithms, a detailed knowledge of the different performance-limiting factors is mandatory. In this contribution we investigate sparse matrix-vector multiplication, which is the domina...

Journal:

Parallel Processing Letters

Volume 21

Publication date: 2011
